THE FIRST STAGE OF A RESEARCH PROJECT INVESTIGATING
REINFORCER PREFERENCES IN DEVELOPMENTAL RETARDATES IS
DESCRIBED. THE SUBJECTS, 12 MALES AND THREE FEMALES
(CHRONOLOGICAL AGE 10 TO 22, MENTAL AGE 2.8 TO 8.7), WERE
PRESENTED WITH A TASK IN WHICH 35MM COLOR SLIDES WERE
PROJECTED ONTO A CONSOLE WINDOW. RESPONSES REQUIRED SUBJECTS
TO CHOOSE AMONG FOUR REINFORCERS--M & M CANDIES, CHEERIOS,
TRINKETS, AND PENNIES. RESULTS INDICATED THAT MOST SUBJECTS
TENDED TO DISTRIBUTE THEIR REINFORCER CHOICE RESPONSES IN ONE
OF TWO WAYS: (1) CHOICES WERE INITIALLY DISTRIBUTED OVER THE
FOUR REINFORCERS, AND WITHIN SIX SESSIONS ONE REINFORCER
BECAME MORE FREQUENTLY SELECTED, AND (2) A PARTICULAR
REINFORCER WAS INITIALLY SELECTED WITH HIGH FREQUENCY, AND A
SECOND REINFORCER DEVELOPED AS A LOW FREQUENCY CHOICE. OTHER
RESPONSE PATTERNS WERE ALTERNATION ON A CYCLICAL BASIS AND 
VARIABILITY OF CHOICE NOT BECOMING STABLE UNTIL THE 25TH 
SESSION. FURTHER REFINEMENT OF METHODOLOGY IS INDICATED. 
EIGHTEEN GRAPHS AND FOUR REFERENCES ARE INCLUDED. (DT) 
THIS IS A WORKING REPORT ON THE FIRST
STAGE OF A RESEARCH PROJECT IN THE 
EXPERIMENTAL ANALYSIS OF REINFORCER 
HIERARCHIES IN DEVELOPMENTAL RETARDATES. 
DATA AND DISCUSSION ARE RESTRICTED TO 
THE DEVELOPMENT OF A METHODOLOGY, AND 
SHOULD NOT BE INTERPRETED AS ANALYSIS 
THE EXPERIMENTAL ANALYSIS OF REINFORCER HIERARCHIES IN
DEVELOPMENTAL RETARDATES: BASELINE STABILIZATION 

Robert Orlando and Russell M. Tyler 

Institute on Mental Retardation and Intellectual Development 
George Peabody College for Teachers 



The behavior of an individual is viewed as the product of a continuous 
interaction between the individual and his environment. At any point in 
time, behavior is viewed as the product of the interaction between current
environmental factors and the behavioral characteristics of the individual.
Among the environmental factors which operate on behavior, reinforcement,
the consequence of behavior, is prominent.

Although differences between individuals with respect to effective 
reinforcement consequences are recognized, it frequently is assumed that
specific reinforcing events maintain their effectiveness for a given indi- 
vidual over a long period of time. It further is assumed that there 
generally is a high degree of congruence between the kinds of effective 
reinforcers for an individual and the reinforcers present in the environment. 
It appears that data bearing on these assumptions are necessary for a 
more adequate understanding of factors influencing the interactions between 
behavior and the environment. 

Knowledge of these factors particularly is important in the case of 
retarded individuals. It frequently is noted that the retardate's behavior 
is unusually impervious to those environmental consequences generally 
effective as reinforcers for the majority of individuals. Some of the
slow learning and maladaptive behavior of the retardate may be accounted
for by this imperviousness, the instability of events as effective rein-
forcers, or the lack of congruence between environmental events and
those which are functionally effective for the specific retardate. 

The first problem encountered in the assessment of the relative 
effectiveness of a set of reinforcing events is the development of a method- 
ology which meets the following criteria: unconfounded evaluation of 
a number of reinforcing events concurrently; reliability of assessment over 
repeated measures; and sufficient stability and sensitivity to permit 
classification and parameter analysis within individual subjects. The 
usual method has been that of the paired comparisons approach, in which
individual subjects are asked to select one reinforcer in each of all possi-
ble pairings of a set of reinforcers (Schutz & Naumoff, 1964; Tyrrell,
Witryol, & Silverg, 1963; Witryol & Fischer, 1960). Although repeated
measures with individual subjects are feasible (Tyrrell, 1963), the
period of time over which measures have been taken appears to be of insuf-
ficient duration to permit the evaluation of long-term changes in the
behavior. Further, choice responses are made in pairs and although the 
methodology provides for comparisons of behavior with all possible pairs, 
at no time is the behavior sampled with all reinforcers in the set available. 
In view of the fact that individuals do not experience the delivery of the 
selected reinforcer as a consequence of their choice, it is difficult to 
assess what the consequence of choice is, and what effect it has on choice
behavior.

The first objective of this research program was to develop a method-
ology for reliable and repeated assessment of the ranking of the relative
effectiveness (by means of choice-response data) within a specific set of 
events for individual subjects. Events were chosen which were both 
representative of those usually found to be reinforcing for members of the 
retarded population (M & Ms, Cheerios, trinkets and pennies) and repre-
sentative of three major classes of tangible events (consumables,
manipulables and generalized reinforcers). The procedure was so designed
that each subject is afforded repeated opportunities to select and receive
one of the reinforcers from an array of all four.

The second objective was to obtain information on the organization 
and long-term stability of the choice behavior of a small sample of retarded 
individuals under these conditions. The third was to obtain information
about the methodology itself, in order to develop a methodology which 
meets the criteria previously enumerated and provides a baseline behavior 
appropriate for the analysis of the parameters of reinforcer hierarchy and 
the relationships between these parameters and the acquisition and main-
tenance of complex behaviors.



METHOD 



Subjects 

Subjects were residents of a state institution for the retarded.¹ Fif-
teen individuals, 12 males and three females, were selected from the
population who met the following criteria: no gross sensory-motor hand-
icaps; no severe behavioral problems; no history of chronic illness; MA
two or more yrs and CA 10 to 25 yrs. The MA range of Ss was 2.75 to
8.67 yrs, mean 5.38 yrs, and the CA range was 10 to 22 yrs, mean 15
yrs.

¹Clover Bottom Hospital and School, Donelson, Tenn., George L.
Wadsworth, M.D., Superintendent.

Apparatus 

The apparatus was a modification of the multiple-choice visual 
discrimination apparatus described in detail by Hively (1964). It con- 
sisted of a large, wall-mounted console, on the face of which were one 
large, rectangular window and, below, four smaller windows. Stimuli
were rear-projected onto these windows, which were connected to
micro-switches. Pressure on any window resulted in micro-switch
closure. A reinforcer receptacle was located at the lower right-hand
corner of the console, and each delivery of a reinforcer was accompanied 
by a 3-sec illumination of the receptacle and the simultaneous opera- 
tion of a buzzer. 

The console was located in a sound-attenuated room containing two 
chairs, a one-way observation window and an intercom for auditory
monitoring. Fully automated programming and recording apparatus were
located in an adjoining control room. 

Stimuli were 35 mm color transparencies projected on the console 
windows. Two types of stimuli were presented alternately; one
projected green light onto all five windows (access stimulus) and the
other projected pictures of the four reinforcers (M & Ms, Cheerios,
trinkets and pennies), with a different reinforcer pictured on each of
the smaller windows. Positions of the reinforcers varied from slide
to slide.

Procedure 

Subjects were seen individually one to three times weekly. They 
were led from the laboratory waiting room by E with the instruction,
"Come with me," and brought into the experimental room. In the first
session, E stood to the left of the console, instructed S to sit in the
chair in front of the console, and said, "Watch me." With the access
stimulus (green light) on, E made three discrete, paced responses on
the larger (access) window, which were followed by removal of the access
stimulus and the presentation of the choice stimulus (one of the 24 possi-
ble arrangements of the reinforcer pictures). The S was instructed, "Get
whatever you want," and if he failed to respond appropriately on one of
the choice windows within approximately 5 sec, the instruction was
repeated. Subjects who failed to respond with two repetitions of the
instruction were told, "Push on the one you want." Two consecutive
responses on one of the choice windows were required. If S failed to
emit two consecutive responses, the instruction to "Push on the one you
want" was repeated. None of the Ss failed to respond under these
instructions.
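
The figure of 24 choice slides follows from placing four distinct reinforcer pictures on four choice windows (4! = 24). A minimal sketch of that enumeration is given here; the function name and the left-to-right window ordering are assumptions made for illustration and do not describe how the slides actually were prepared.

```python
from itertools import permutations

# The four reinforcers named in the report.
REINFORCERS = ("M & Ms", "Cheerios", "trinkets", "pennies")


def choice_slides(reinforcers=REINFORCERS):
    """Return every assignment of reinforcer pictures to the choice windows.

    Each tuple lists the pictures from the left-most to the right-most
    window (an assumed ordering, used only for illustration).
    """
    return list(permutations(reinforcers))


slides = choice_slides()
print(len(slides))  # number of distinct arrangements
```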



Following the emission of two consecutive responses on one window,
one unit of the reinforcer pictured there was delivered, the choice
stimulus was removed and the access stimulus was re-presented. The S
then was instructed, "Now you do it," and all Ss made three responses
on the access window. In some cases, it was necessary for E to point
to the access window before S would respond. The E remained until S
emitted one complete access-choice response chain without assistance,
leaving with the instruction, "Get whatever you want, and I'll be back
when it's time to leave."

Stimuli were programmed on a chain mult FR 3 FR 2 schedule, with
three access and two choice responses, in sequence, required in the 
presence of the appropriate stimuli before a reinforcer was delivered.
Simultaneous responding on two or more windows was not reinforced, nor 
were alternating choice responses (one response on one choice window 
followed by one on another) nor operation of the choice windows in the 
presence of the access stimulus. 
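
The contingencies described above can be sketched as a small two-link state machine. This is an illustrative sketch only, not the relay programming actually used: the class and method names are hypothetical, and two behaviors the text does not specify are assumed (a simultaneous press resets the run in progress, and access-window presses during the choice stimulus are ignored).

```python
class ChainSchedule:
    """Two-link chain: access responses, then consecutive choice responses."""

    def __init__(self, access_ratio=3, choice_ratio=2):
        self.access_ratio = access_ratio
        self.choice_ratio = choice_ratio
        self.link = "access"      # which stimulus is currently projected
        self.count = 0            # responses counted in the current run
        self.last_window = None   # choice window of the current run

    def press(self, windows):
        """Register one response on the set `windows`.

        Returns the choice window that was reinforced, or None.
        """
        windows = set(windows)
        if len(windows) != 1:
            # Simultaneous responding is never reinforced.
            # Assumption: it also resets the run in progress.
            self.count, self.last_window = 0, None
            return None
        (window,) = windows
        if self.link == "access":
            if window != "access":
                return None  # choice windows are inoperative here
            self.count += 1
            if self.count >= self.access_ratio:
                self.link, self.count, self.last_window = "choice", 0, None
            return None
        if window == "access":
            return None  # assumption: ignored during the choice stimulus
        if window == self.last_window:
            self.count += 1
        else:
            # Alternation starts a new run on the newly pressed window.
            self.last_window, self.count = window, 1
        if self.count >= self.choice_ratio:
            self.link, self.count, self.last_window = "access", 0, None
            return window
        return None
```

For example, three presses on the access window followed by two presses on the same choice window yield one reinforcer, whereas one press on each of two different choice windows yields none.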

The ratio on the choice response was changed from FR 2 to FR 3 when
S emitted the second response within 5 sec of the first for five consecu-
tive choice trials, and did not alternate choice responses during any trial.
Similar criteria were applied to access responses, and the access ratio 
was adjusted over the first three sessions to that value which resulted 
in each S receiving no more than 100 reinforcers per session; the access 
ratio remained fixed at that value for the remainder of the baseline ses-
sions. Session length was fixed at 30 min in order to obtain a reliable
sample of behavior within each session, and the maximum number of
reinforcers was established as 100 to avoid reinforcer satiation.
At the beginning of the second and all subsequent sessions, E led S
to the experimental room saying, "Come with me." The E left at the door
to the room with the instruction, "Get whatever you want, and I'll be
back when it's time to leave." At the end of each session, E re-entered
the room and said, "That's all for today." If necessary, E remained with
S while reinforcers were put into a small bag which had been put in the
reinforcer receptacle before the beginning of the session. 



RESULTS 

Over the first six sessions, Ss tend to distribute their reinforcer
choice responses in one of two ways. A common pattern is one in which
choices are distributed over the four reinforcers, with no clear preferences
discernible. Within approximately six sessions, one reinforcer becomes
more frequently selected, with the others decreasing to near-zero
frequency or selected occasionally in some cyclical fashion. Figure 1
is an example of this type of distribution. Initially, choices are dis-
tributed over the four reinforcers, and the pattern of choices varies from
one session to the next. By the fifth session, two trends may be
observed - pennies are being selected with increasing frequency and there
is a sharp decrease in the number of M & M choices. Subsequently,
there is a systematic alternation of penny and M & M choices, with a
maximum of approximately 25 M & Ms selected in any session. 

Figure 2 presents the same data plotted in terms of the per cent of 
total reinforcer choices per session; this describes relative choice 
behavior independently of the total number of reinforcers obtained. The
same pattern may be noted - pennies are selected on approximately 
75 to 100 per cent of the occasions when a choice may be made, and 
M & Ms are selected on an alternate session basis, with no more than 
23 per cent of choices being M & M during a given session. When 
plotted independently of day-to-day variation in rate (the number of 
choice responses made per session), the data show a greater stability 
with respect to the distribution of choices over sessions. The cyclical 
nature of the distribution remains evident. 
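
The conversion from frequency to per cent of total choices used in the even-numbered figures can be sketched as follows. The function name is hypothetical, and the session counts in the example are invented for illustration; they are not data from any subject.

```python
def percent_choices(counts):
    """Convert per-session choice frequencies to per cent of total choices.

    `counts` maps each reinforcer to the number of times it was selected
    in one session. A session with no choices yields 0.0 for every
    reinforcer rather than a division by zero.
    """
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in counts}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical session, for illustration only.
session = {"pennies": 75, "M & Ms": 25, "Cheerios": 0, "trinkets": 0}
print(percent_choices(session))
```

Because each session is divided by its own total, two sessions with very different overall rates but the same relative distribution produce identical percentages, which is the sense in which the per cent plots are independent of rate.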

Fig. 3. Frequency of reinforcer choices per session, S Wl.

A somewhat different picture may be seen in Figure 3. In this case,
one reinforcer, pennies, was selected at every opportunity during the
first session, and it was not until the sixth session that another 
reinforcer, M & Ms, was selected. Over these sessions, the total 
number of reinforcer choice responses made underwent a decrease as a 
function of manipulation of the access ratio. Rate then shows some 
leveling and the pattern of high frequency of selection of pennies with 
some alternation between low and zero frequency of M & M selection 
may be observed. This no longer may be seen after some 20 sessions. 

The same data are presented in Figure 4, and the same transition
in behavior over the first few sessions may be noted. By the twentieth
session, the distribution clearly has stabilized, with pennies being
selected some 98 to 100 per cent of the time for 10 successive sessions. 

Another example of this type of distribution is shown in Figures 
5 and 6. For the first seven sessions, only pennies were selected. As 
in the previous examples, a second reinforcer, in this case M & Ms,
starts being selected with a relatively very low frequency, and there is
some regular variation in this frequency from session to session. 

The second common type of distribution, then, is one which is 
characterized by a high frequency of selection of a particular reinforcer 
from the first session. A second reinforcer then develops as a low 
frequency choice, and there is a systematic, cyclic pattern to the 
selection of this reinforcer, usually an alternate day pattern. Figures 
7 and 8 are another example of this sort of distribution pattern. 



In the previous examples, cyclical changes in the choice responses 
have been regular, but have involved low-frequency behavior super-
imposed on extremely regular high-frequency behavior. These examples
may be characterized as instances of a stable and high preference for a 
particular reinforcer, with some form of alternating, low-frequency choice 
of a second reinforcer. Other subjects show far greater cyclicity. 

Fig. 5. Frequency of reinforcer choices per session, S J.

Fig. 7. Frequency of reinforcer choices per session, S B.

Fig. 8. Per cent reinforcer choices per session, S B.

Fig. 9. Frequency of reinforcer choices per session, S Gl.

Figure 9 shows the behavior of an individual who for several sessions
showed approximately the same frequency of choice for two reinforcers
(M & Ms and pennies) and a regular, session-to-session alternation
between these two. By the 27th session, the cyclicity remains strongly
evident, but the frequency with which each reinforcer is selected shows
marked change - the frequency of one (M & M) varies from zero to
approximately 30, while the frequency of the other (penny) varies from
50 to 100. The cyclic variability remains, but there no longer is any over-
lap between the two frequencies, and the behavior may be said to be
developing a high degree of stability with respect to a clear "preference"
for a particular reinforcer. The same trend may be observed in Figure 10,
per cent of reinforcer choices independent of overall rate of choice responses.

Figures 11 and 12 are another example of this type of behavior. Here, 
there is both session-to-session alternation between two reinforcers
(M & Ms and pennies), and a cyclic phenomenon with a third reinforcer 
(Cheerios) which developed over a long period of time. Cheerios were 
selected some ^0 times early in the series of sessions and there was a 
gradual decrease to a level of approximately 10 to 15 selections per ses-
sion for 20 successive sessions. By about the thirty-fourth session, the
frequency had increased to a level of approximately 65 choices, and again 
was followed by a decrease in frequency. At this point, the data indicate 
that the leveling point may be somewhat higher than previously. In view 
of the fact that overall rate also appears to be undergoing a systematic 
decrease, this, in part, may account for the current level. However, 
when the data are examined with some control for rate, Figure 12, the
leveling at a higher value still seems evident.

Figures 13 and 14 also are examples of cyclicity and a tendency for 
overall rate to decrease over a number of sessions. In this case, behavior 
which seemed to be fairly well distributed over three reinforcers for the 
first 20 sessions shows the partial breakdown of the distribution. What 
was a clear separation of pennies, 60 to 110 per session, M & Ms, 15 to 
50 per session, and Cheerios, zero to 5 per session, no longer is evident 
in the data from the last 20 of a total of 40 sessions. Again, an overall
rate of reinforcer choice responses is correlated with this change in the 
distribution of choice responses. 

Finally, there is a small number of Ss who show some indications of
a stable distribution only after many sessions. An example of this may be 
seen in Figures 15 and 16. In this instance, variability from session to 
session is high, with some sort of clear distribution possibly to be seen 
by the twenty-fifth session. However, it is not clear whether this represents 
the beginning of a stable high-frequency selection of trinkets that will 
continue, or whether it is comparable to the distribution as it appears 
around the fifteenth session, a short-term separation which was followed 
by a return to a high degree of variability. Generally, the behavior may be
described as a consistent "preference" for a particular reinforcer, trinkets,
but the variability which can be seen in the other choices precludes, at
present, definitive statements about a stabilized distribution - a clear
hierarchy of reinforcers.

A more extreme example may be seen in the behaviors depicted in 
Figures 17 and 18. Clearly, there is marked variability in the behavior 
from session to session. Although any one reinforcer may be followed 
over sessions and some regularity in its frequency of selection noted, it 
is not possible to observe any simple and consistent pattern which might 
be labelled as a stable hierarchy, and the day-to-day variability is of 
such a nature that additional data on this individual's behavior are 
required before any long-term cycles may be identified.



DISCUSSION 

The results indicate the feasibility of obtaining repeated measures of
reinforcer choice behavior both within sessions and over a long series of
sessions, and in a situation in which individual Ss are afforded an oppor-
tunity to select from an array of reinforcers. The number and kinds of
reinforcers in the array are limited only by restriction of the apparatus de-
scribed to events which can be depicted graphically. Within this
restriction, the possibility still exists for the inclusion of other reinforcing 
events, such as access to social stimuli, through the use of token reinforcers 
which could be exchanged for such social reinforcers. It thus appears 
feasible to evaluate a large number of reinforcing events for each individ-
ual, and to measure preference as a function of the number and kinds of
events in the array. 

A difficulty in the assessment of relative effectiveness encountered 
in the present study is the fact that it appears that for most Ss one rein-
forcer is more effective than all the others to the degree that responses
primarily are made to this one reinforcer to the almost virtual exclusion of
the others. Exceptions to this tend to be cases in which a second reinforcer
is selected with a frequency only slightly above the 10 or 15 per cent level,
and on some sort of cyclical basis. Clearly, it is not possible to make a
meaningful differentiation among the relative preferences for all reinforcers
in each case. Thus, for individual analysis, it would seem to be advisable
to manipulate the kinds of reinforcers available and other parameters, such
as response effort and the amount of reinforcement delivered, in order
to establish a clearly differentiated hierarchy of reinforcing events for
individual Ss.

The assessment procedure also is confounded with day-to-day 
variation in the number of reinforcers received per session. In a minor- 
ity of cases, the total number of reinforcers received per session 
varies within acceptable limits. More frequently, rate either increases 
to the point at which the rate is limited by the characteristics of the 
responding organism (responses cannot be emitted at a higher rate) or 
those of the apparatus (programming and recording components will 
function no faster), or there is a steady decrease in rate over sessions, 
and less behavior is sampled each session. 

In an attempt to control for day-to-day variation in rate, and to 
eliminate the possible confounding effects, measures and criteria are 
being developed to include within- and between-session adjustment to
the individual's rate of responding. This is accomplished through
repeated within-session assessment of rate, and adjustment of the
access ratio to that value which would result in the accumulation of
80 to 120 reinforcers per 30-min session. Concurrently, criteria are
being developed to determine the optimal degree of change in ratio.

The goal is to determine measures and criteria related to rate of response 
that will be continuously adjusting to the behavior of the S and which
will result in highly stable rates of behavior. 
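
One way the within-session adjustment described above might be sketched is given here. The report states only the 80-to-120 target and the 30-min session; the linear projection, the one-unit step size, and all function and parameter names are assumptions made for illustration, since the optimal degree of change in ratio is explicitly still under development.

```python
def projected_reinforcers(earned, elapsed_min, session_min=30.0):
    """Project the session total from the rate observed so far."""
    if elapsed_min <= 0:
        raise ValueError("elapsed_min must be positive")
    return earned * session_min / elapsed_min


def adjust_access_ratio(ratio, earned, elapsed_min,
                        low=80, high=120, step=1, session_min=30.0):
    """Return the access ratio to use for the rest of the session.

    Raises the ratio when the projected total exceeds `high`, lowers it
    (never below 1) when the projection falls under `low`, and otherwise
    leaves it unchanged. `step` is an assumed, fixed increment.
    """
    projected = projected_reinforcers(earned, elapsed_min, session_min)
    if projected > high:
        return ratio + step
    if projected < low:
        return max(1, ratio - step)
    return ratio


# Illustrative call: 50 reinforcers in the first 10 min projects to 150,
# so the access ratio is raised by one step.
print(adjust_access_ratio(3, earned=50, elapsed_min=10))
```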

It is obvious from the data collected thus far that repeated 
measures are essential. Assuming that the behavior of those Ss who
show a high frequency of selection of a particular reinforcer is to 
some degree a function of the fact that a strong reinforcer is being 
compared with relatively weak ones, it may be the case that the sta- 
bility of choice responses would not be maintained were other 
reinforcing events added to the array. Their behavior then would be
more similar to that of Ss who show some initial variability in terms of
reinforcers chosen and the frequencies with which they are chosen. 

In the latter instance, it is clear that no hierarchical differentiation is 
discernible within the first few sessions. No statement based on the
frequencies with which reinforcers are selected is meaningful in the 
context of a greater number of sessions: predictions of reinforcer 
effectiveness based on these choices have little validity in a situation in
which repeated sessions are administered.

Any preference stabilization that occurs does so only after several 
sessions, and the number of sessions preceding stabilization varies for 
individual Ss. Further, the superimposed cycles of low frequency choice
behavior on what might be called a high and stable preference for one 
reinforcer must be accounted for if behavior change as a function of the 
manipulation of a parameter or imposed variable is to be evaluated. In the 
present data, examples of both day-to-day and long-term cyclicity are 
evident, and appear to be the rule rather than the exception. 

This sort of variability in the data is apparent only with several repeated 
sessions. Assessment of its degree of regularity requires extensive and 
unconfounded measurement, and only in determining its regularity is it 
possible to separate those changes which are a function of cyclic regularity 
and those which are a function of variable manipulation. To the extent that 
refinements in methodology result in increased stability of behavior, the 
procedure described herein will serve as a stable baseline for the analysis of 
the parameters of reinforcer hierarchy and the relationships between these 
parameters and the acquisition and maintenance of complex behaviors. The
sensitivity of the baseline remains to be formally demonstrated, but pilot 
data suggest that it is at least sensitive to such variables as the choice 
ratio (number of responses required on a choice window prior to the delivery 
of the reinforcer), and other variables will be evaluated as the methodology 
further is developed. 